Automatic Fluency Assessment Method for Spontaneous Speech without Reference Text

Abstract

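The automatic fluency assessment of spontaneous speech without reference text is a challenging task that heavily depends on the accuracy of automatic speech recognition (ASR). Considering this scenario, it is necessary to explore an assessment method that combines ASR. This is mainly due to the fact that, in addition to acoustic features being essential for assessment, the text features output by ASR may also contain potential fluency information. However, most existing studies on automatic fluency assessment of spontaneous speech are based solely on audio features, without utilizing textual information, which may lead to a limited understanding of fluency features. To address this, we propose a multimodal automatic speech fluency assessment method that combines ASR output. Specifically, we first explore the relevance between the fluency assessment task and the ASR task, and fine-tune the Wav2Vec2.0 model using multi-task learning to jointly optimize the ASR task and the fluency assessment task, resulting in both fluency assessment results and ASR output. Then, the text features and audio features obtained from the fine-tuned model are fed into the multimodal fluency assessment model, using attention mechanisms to obtain more reliable assessment results. Finally, experiments on the PSCPSF and Speechocean762 datasets suggest that our proposed method performs well in different assessment scenarios.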
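The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which attention variant or fusion layout the authors use, so the code below assumes plain scaled dot-product cross-attention in which text features act as queries over audio frames, followed by mean pooling and concatenation. All function names (`softmax`, `cross_attention`, `mean_pool`, `fuse`) and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    Assumption: queries and keys share the same feature dimension.
    """
    d = len(keys[0])
    dim_out = len(values[0])
    attended = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_out)]
        )
    return attended


def mean_pool(vectors):
    """Average a sequence of vectors into a single vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def fuse(text_feats, audio_feats):
    """Hypothetical fusion: pooled text features concatenated with
    pooled audio context gathered by text-to-audio attention."""
    attended = cross_attention(text_feats, audio_feats, audio_feats)
    return mean_pool(text_feats) + mean_pool(attended)


# Toy inputs: 2 text-token vectors and 3 audio-frame vectors, all 2-dimensional.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
audio_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = fuse(text_feats, audio_feats)
```

In a real system the fused vector would be passed to a regression or classification head that predicts the fluency score. That head, and the multi-task fine-tuning of Wav2Vec2.0 that produces the features, are outside the scope of this sketch.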

Similar resources

Automatic Fluency Assessment by Signal-Level Measurement of Spontaneous Speech

In its narrow sense, the term fluency connotes fluidity of speech. This study is a step in the quest for objective language assessment methods one of which is rating for oral fluency in a second language. In particular, we seek to find what measures obtained from a spontaneous utterance can be used as predictors of fluency and, to assess the utility of a set of acoustic measures obtained by sig...

Improving oral reading fluency assessment using automatic speech processing technologies

Using speech processing technologies with third-grade readers, several new measures of reading rate were developed that involved measuring reading speed of individual words within the context of continuous text. Scores from one test form consisting of three passages were correlated with scores from another test form with three different passages to evaluate each measure’s reliability. Correlati...

An Empirical Text Transformation Method for Spontaneous Speech Synthesizers

The Fillers Inventory and Models block expands this scope of conventional text analysis. It facilitates the inclusion of spontaneous speech features, which would be later synthesized, during waveform generation. It is assumed during text analysis that the corresponding inventory of spontaneous speech features is available for waveform generation. The required Text Analysis/transformation perfor...

Linguistic Resources for Reconstructing Spontaneous Speech Text

The output of a speech recognition system is not always ideal for subsequent downstream processing, in part because speakers themselves often make mistakes. A system would accomplish speech reconstruction of its spontaneous speech input if its output were to represent, in flawless, fluent, and content-preserving English, the message that the speaker intended to convey. These cleaner speech tran...

Journal

Journal title: Electronics

Year: 2023

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics12081775